Abstract

In this paper, we are interested in heuristic parameter choice rules for 
general convex variational regularization which are based on error esti- 
mates. Two such rules are derived and generalize those from quadratic 
regularization, namely the Hanke-Raus rule and quasi-optimality crite- 
rion. A posteriori error estimates are shown for the Hanke-Raus rule, and 
convergence for both rules is also discussed. Numerical results for both 
rules are presented to illustrate their applicability. 

1 Introduction 

We consider the ill-posed problem of determining a solution $x$ to

$Kx = y^\delta$, (1)

when only a noisy version $y^\delta$ of the exact data $y^\dagger$ is available, which furthermore
satisfies the inequality $\|y^\delta - y^\dagger\| \le \delta$. In our setting $K : X \to Y$ is a bounded
and linear operator mapping from a Banach space $X$ into a Hilbert space $Y$.

As usual for inverse problems, the numerical solution of problem (1) suffers
from ill-posedness. In particular, a small change in the data $y^\delta$ can lead to
an enormous deviation of the solution $x$. To combat the inherent instability,
regularization has been established as an effective approach since the pioneering
work of Tikhonov [35]. The regularization method under consideration is
general convex Tikhonov regularization, i.e., for a convex and (weakly) lower
semicontinuous functional $R : X \to [0, \infty]$, we seek a minimizer, denoted by
$x_\alpha^\delta$, of the functional

$J_\alpha(x) = \frac{1}{2}\|Kx - y^\delta\|^2 + \alpha R(x)$, (2)

and take the minimizer $x_\alpha^\delta$ as an approximate solution to the unknown
exact solution $x^\dagger$. Here $R$ is the regularization functional incorporating a priori
information, and $\alpha$ is known as the regularization parameter, determining the
tradeoff between the data fitting term and the regularization term.

Tikhonov regularization formulations of this form have attracted consider- 
able interest in recent years, and have found applications in diverse disciplines, 
e.g., imaging science [33J E2 and signal processing [T51 HD]. Because of their
immense practical importance, the functional $J_\alpha$ has been the subject of many
recent investigations. Theoretically speaking, since the pioneering work [5],
convergence and convergence rates under a variety of conditions have been
established [321 123 [30l EOj. Numerically, several efficient algorithms have also
been proposed [TT 1 I2T ] [35].

But one of the most important questions in applying these techniques to
practical problems, i.e., choosing an appropriate regularization parameter $\alpha$,
remains largely underexplored. While the problem of parameter choice has
been discussed in depth for the conventional quadratic regularization, see e.g.,
[18] for theoretical studies and [23, 37] for details on numerical implementation,
the case of general convex regularization has scarcely been addressed. As to
existing studies on parameter selection for Tikhonov regularization in Banach
spaces, we are aware of Morozov's discrepancy principle [31], which was recently
investigated in [3, 28]. Some theoretical results, e.g., convergence and convergence
rates, were derived. In the latter work, an algorithm for solving the discrepancy
equation was also proposed. However, the discrepancy principle requires an
estimate of the noise level, which is not always available. Therefore, there is
significant interest in deriving heuristic choice rules which do not require
knowledge of the exact noise level and still allow some theoretical justification.
One such rule is due to the authors [21], where existence of a solution and
a posteriori error estimates are derived. Another is the balancing principle,
recently derived using the model function approach in [13], for a model with $L^1$
data fitting and quadratic regularization.

In the present study, we shall derive two heuristic choice rules based on
error estimates, which are achieved by a refined analysis of the regularization
process. Error-estimate-based heuristic choice rules are well known for the
conventional quadratic regularization [18], but to the best of the authors' knowledge,
there is no known rule of this type for general convex variational regularization.
The derived rules generalize the Hanke-Raus rule and the quasi-optimality criterion
for quadratic regularization to general convex regularization. Some theoretical
justifications, e.g., existence, a posteriori error estimates and convergence, of both
rules are provided. Numerical results are presented to validate some theoretical
findings and to illustrate the features of both rules.

Notation: The linear operator $K : X \to Y$ from a Banach space $X$ into a
Hilbert space $Y$ is assumed to be bounded; $K^* : Y \to X^*$ denotes its adjoint
operator. We assume that the exact data is attainable, i.e., $y^\dagger \in \operatorname{range} K$,
and the noisy data $y^\delta$ satisfies $\|y^\dagger - y^\delta\| \le \delta$. The functional $R : X \to [0, \infty]$
is assumed to be proper, convex, weakly lower semicontinuous and coercive.
These conditions ensure that the functional $J_\alpha$ defined in (2) possesses minimizers
(cf. [25]). We shall denote by $x_\alpha^\delta$ a minimizer of the functional $J_\alpha$, and by $x_\alpha$
a corresponding minimizer for exact data $y^\dagger$, i.e.,

$x_\alpha \in \arg\min \{ \frac{1}{2}\|Kx - y^\dagger\|^2 + \alpha R(x) \}$.
By $x^\dagger$ we denote a minimum-$R$ solution of the equation $Kx = y^\dagger$ (see e.g. [25]).
With $\partial R(x)$ we denote the subdifferential of a convex functional $R$ at $x$ [17].
Throughout the paper we assume that the exact solution $x^\dagger$ fulfills the following
source condition (see [8]):

$\exists w : K^* w \in \partial R(x^\dagger)$. (3)

For any $\xi \in \partial R(x)$, we denote the Bregman distance from $x$ to $x'$ with respect
to $\xi$ by

$D_\xi(x', x) = R(x') - R(x) - \langle \xi, x' - x \rangle$.

We note that the Bregman distance $D_\xi(x', x)$ is always nonnegative, although in
general it can vanish for distinct $x'$ and $x$. The Bregman distance provides a natural
measure of various errors, and for a detailed discussion, we refer to [9].
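To make the definition concrete, the following minimal sketch evaluates the Bregman distance in the finite-dimensional setting $X = \mathbb{R}^n$ (an assumption made only for illustration). The helper names `bregman`, `half_sq_norm`, `l1_norm` and `l1_subgradient` are hypothetical and not taken from the paper. For $R(x) = \frac{1}{2}\|x\|^2$ the distance reduces to $\frac{1}{2}\|x' - x\|^2$, whereas for $R(x) = \|x\|_{\ell^1}$ it vanishes for two distinct points that share a sign pattern.

```python
# Bregman distances on R^n for two regularizers (illustrative helper names).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bregman(R, xi, x_prime, x):
    """D_xi(x', x) = R(x') - R(x) - <xi, x' - x>, with xi a subgradient of R at x."""
    diff = [a - b for a, b in zip(x_prime, x)]
    return R(x_prime) - R(x) - dot(xi, diff)

def half_sq_norm(x):
    """R(x) = 1/2 ||x||^2; its only subgradient at x is x itself."""
    return 0.5 * dot(x, x)

def l1_norm(x):
    """R(x) = ||x||_1."""
    return sum(abs(a) for a in x)

def l1_subgradient(x):
    """One element of the l1 subdifferential: sign(x_k), choosing 0 where x_k = 0."""
    return [(a > 0) - (a < 0) for a in x]

x = [1.0, 0.0, -2.0]
x_prime = [3.0, 0.0, -0.5]   # distinct from x, but with the same sign pattern

# Quadratic case: equals 1/2 ||x' - x||^2.
d_quad = bregman(half_sq_norm, x, x_prime, x)

# l1 case: vanishes although x' differs from x.
d_l1 = bregman(l1_norm, l1_subgradient(x), x_prime, x)
```

The second value illustrates why error estimates in the Bregman distance do not automatically yield norm estimates in the $\ell^1$ case.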



2 Estimates for different errors 

In the case of regularization in Hilbert spaces, one usually splits the total error,
i.e., the distance from $x_\alpha^\delta$ to $x^\dagger$, into the approximation error and the data error,
which refer to the distance from $x_\alpha$ to $x^\dagger$ and that from $x_\alpha^\delta$ to $x_\alpha$, respectively,
cf. [18]. This is achieved with the help of the triangle inequality. Then the
approximation error and the data error are estimated separately to get an estimate for
the total error. Theoretically, the behavior of the approximation error contains
information about how difficult it is to approximate the unknown solution $x^\dagger$
and provides hints on what conditions on $x^\dagger$ may be helpful. The behavior of
the data error shows how noise influences the accuracy of the reconstruction.

In the case of convex variational regularization one usually estimates the
total error directly. One reason is that in this setting the natural distance
measure for the errors is the Bregman distance, which does not fulfill the triangle
inequality. In this section we provide estimates for the different terms. This sheds
light on the regularization process and thereby shows that a splitting into
approximation and data error is still useful.

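Proposition 2.1. Let $x^\dagger$ fulfill the source condition (3) with $\xi = K^* w$. Then
the approximation error and the corresponding discrepancy satisfy

$D_\xi(x_\alpha, x^\dagger) \le \frac{\|w\|^2}{2}\alpha$, (4)
$\|Kx_\alpha - y^\dagger\| \le 2\|w\|\alpha$. (5)

With the choice $\xi_\alpha = -K^*(Kx_\alpha - y^\dagger)/\alpha$, the data error and the corresponding
discrepancy satisfy

$D_{\xi_\alpha}(x_\alpha^\delta, x_\alpha) \le \frac{\delta^2}{2\alpha}$, (6)
$\|K(x_\alpha^\delta - x_\alpha)\| \le 2\delta$. (7)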

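Proof. Inequalities (4) and (5) have been shown in [8]; however, we include a
short proof for the sake of completeness. By the minimizing property of $x_\alpha$ and
the fact $Kx^\dagger = y^\dagger$ we have

$\frac{1}{2}\|Kx_\alpha - y^\dagger\|^2 + \alpha R(x_\alpha) \le \alpha R(x^\dagger)$.

Rearranging the terms and noting $\xi \in \partial R(x^\dagger)$ yields

$\frac{1}{2}\|Kx_\alpha - y^\dagger\|^2 + \alpha D_\xi(x_\alpha, x^\dagger) \le -\alpha \langle \xi, x_\alpha - x^\dagger \rangle$. (8)

By observing the non-negativity of the Bregman distance $D_\xi(x_\alpha, x^\dagger)$ and using
$\xi = K^* w$ and the Cauchy-Schwarz inequality, we obtain

$\frac{1}{2}\|Kx_\alpha - y^\dagger\|^2 \le \alpha \|w\| \|Kx_\alpha - y^\dagger\|$,

which shows estimate (5).

Appealing again to inequality (8) and using $\xi = K^* w$, Cauchy-Schwarz and
Young's inequalities, we arrive at

$\frac{1}{2}\|Kx_\alpha - y^\dagger\|^2 + \alpha D_\xi(x_\alpha, x^\dagger) \le \frac{\alpha^2 \|w\|^2}{2} + \frac{1}{2}\|Kx_\alpha - y^\dagger\|^2$.

This establishes estimate (4).

Next we use the minimizing property of $x_\alpha^\delta$ to get

$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|^2 + \alpha D_{\xi_\alpha}(x_\alpha^\delta, x_\alpha) \le \frac{1}{2}\|Kx_\alpha - y^\delta\|^2 - \alpha \langle \xi_\alpha, x_\alpha^\delta - x_\alpha \rangle$.

From the optimality of $x_\alpha$, we have $\xi_\alpha = -K^*(Kx_\alpha - y^\dagger)/\alpha \in \partial R(x_\alpha)$. Plugging
in $\xi_\alpha$ and rearranging the formula gives

$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|^2 + \alpha D_{\xi_\alpha}(x_\alpha^\delta, x_\alpha) \le \frac{1}{2}\|Kx_\alpha - y^\delta\|^2 + \langle Kx_\alpha - y^\dagger, K(x_\alpha^\delta - x_\alpha) \rangle$
$= \frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|^2 - \frac{1}{2}\|K(x_\alpha^\delta - x_\alpha)\|^2 - \langle y^\dagger - y^\delta, K(x_\alpha^\delta - x_\alpha) \rangle$,

i.e.,

$\frac{1}{2}\|K(x_\alpha^\delta - x_\alpha)\|^2 + \alpha D_{\xi_\alpha}(x_\alpha^\delta, x_\alpha) \le -\langle y^\dagger - y^\delta, K(x_\alpha^\delta - x_\alpha) \rangle$. (9)

Now the non-negativity of the Bregman distance and the Cauchy-Schwarz inequality
yield estimate (7). Next by virtue of inequality (9) and Cauchy-Schwarz and
Young's inequalities, we obtain

$\frac{1}{2}\|K(x_\alpha^\delta - x_\alpha)\|^2 + \alpha D_{\xi_\alpha}(x_\alpha^\delta, x_\alpha) \le \delta \|K(x_\alpha^\delta - x_\alpha)\| \le \frac{\delta^2}{2} + \frac{1}{2}\|K(x_\alpha^\delta - x_\alpha)\|^2$,

which concludes the proof. □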
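The four estimates of Proposition 2.1 can be checked numerically in a simple special case. The sketch below assumes $X = Y = \mathbb{R}^4$, a diagonal operator $K$ and the quadratic functional $R(x) = \frac{1}{2}\|x\|^2$, for which $D_\xi(x', x) = \frac{1}{2}\|x' - x\|^2$ and the minimizer is available in closed form. The test operator, source element and noise vector are arbitrary illustrative choices, and all function names are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def tikhonov_diag(s, y, alpha):
    """Minimizer of 1/2*||Kx - y||^2 + alpha*R(x) for K = diag(s), R(x) = 1/2*||x||^2."""
    return [si * yi / (si * si + alpha) for si, yi in zip(s, y)]

s = [1.0, 0.5, 0.1, 0.01]      # diagonal of K (assumed test operator)
w = [1.0, -2.0, 0.5, 3.0]      # source element

def apply_K(x):
    return [si * xi for si, xi in zip(s, x)]

x_true = apply_K(w)            # x^dagger = K^* w, so the source condition (3) holds
y_true = apply_K(x_true)       # exact data y^dagger
noise = [1e-3, -2e-3, 1e-3, 5e-4]
y_noisy = [a + b for a, b in zip(y_true, noise)]
delta = norm(noise)
alpha = 1e-2

x_a = tikhonov_diag(s, y_true, alpha)         # x_alpha (exact data)
x_a_delta = tikhonov_diag(s, y_noisy, alpha)  # x_alpha^delta (noisy data)

approx_error = 0.5 * norm(sub(x_a, x_true)) ** 2
approx_discrepancy = norm(sub(apply_K(x_a), y_true))
data_error = 0.5 * norm(sub(x_a_delta, x_a)) ** 2
data_discrepancy = norm(apply_K(sub(x_a_delta, x_a)))

# Each entry compares one quantity with its bound from Proposition 2.1.
estimates_hold = {
    "(4)": approx_error <= 0.5 * norm(w) ** 2 * alpha,
    "(5)": approx_discrepancy <= 2 * norm(w) * alpha,
    "(6)": data_error <= delta ** 2 / (2 * alpha),
    "(7)": data_discrepancy <= 2 * delta,
}
```

In this quadratic setting the bounds are not sharp; the sketch only confirms that the approximation and data errors scale as the proposition predicts.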

From [5] we cite the following result. 

Proposition 2.2 (Estimate for the total error). If the source condition (3)
holds with $\xi = K^* w \in \partial R(x^\dagger)$, then we have

$D_\xi(x_\alpha^\delta, x^\dagger) \le \frac{1}{2}\left(\frac{\delta}{\sqrt{\alpha}} + \sqrt{\alpha}\|w\|\right)^2$, (10)
$\|Kx_\alpha^\delta - y^\delta\| \le \delta + 2\alpha\|w\|$. (11)

Although the Bregman distance does in general not fulfill the triangle
inequality, we see that the total error $D(x_\alpha^\delta, x^\dagger)$ behaves like the sum of the
approximation error $D(x_\alpha, x^\dagger)$ and the data error $D(x_\alpha^\delta, x_\alpha)$. Indeed there holds
for $a, b \ge 0$ that $(a + b)^2/2 \le a^2 + b^2 \le (a + b)^2$ and hence, we see that the
estimate (10) behaves like the sum of the estimates (4) and (6).

The connection between the total error in the Bregman distance and the 
approximation and data errors can be made a bit more precise. To this end, we 
utilize the following lemma which is an immediate consequence of the definition 
of the Bregman distance: 
Lemma 2.3. Let $\xi \in \partial R(x^\dagger)$ and $\zeta \in \partial R(x)$. Then there holds for any $x'$ that
$D_\xi(x', x^\dagger) = D_\zeta(x', x) + D_\xi(x, x^\dagger) + \langle \xi - \zeta, x - x' \rangle$.
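For completeness, the identity of Lemma 2.3 can be verified by expanding the three Bregman distances according to their definition. The following derivation is a sketch of this routine computation and uses only the symbols of the lemma.

```latex
\begin{align*}
D_\xi(x', x^\dagger) - D_\zeta(x', x) - D_\xi(x, x^\dagger)
 &= \bigl[R(x') - R(x^\dagger) - \langle \xi, x' - x^\dagger \rangle\bigr]
  - \bigl[R(x') - R(x) - \langle \zeta, x' - x \rangle\bigr] \\
 &\quad - \bigl[R(x) - R(x^\dagger) - \langle \xi, x - x^\dagger \rangle\bigr] \\
 &= \langle \xi, x - x' \rangle + \langle \zeta, x' - x \rangle \\
 &= \langle \xi - \zeta, x - x' \rangle .
\end{align*}
```

All values of $R$ cancel, so only the duality pairings remain.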
The next result is a consequence of Lemma 2.3 and will be used frequently.



Corollary 2.4. Let the source condition (3) be fulfilled. Then with the obvious
choices of the respective subgradients, there holds

$|D(x_\alpha^\delta, x^\dagger) - (D(x_\alpha^\delta, x_\alpha) + D(x_\alpha, x^\dagger))| \le 6\|w\|\delta$.

Proof. Taking $x = x_\alpha$, $x' = x_\alpha^\delta$, $\xi = K^* w$ and $\zeta = \xi_\alpha = -K^*(Kx_\alpha - y^\dagger)/\alpha$ in
Lemma 2.3 gives

$D(x_\alpha^\delta, x^\dagger) = D(x_\alpha^\delta, x_\alpha) + D(x_\alpha, x^\dagger) + \langle w + (Kx_\alpha - y^\dagger)/\alpha, K(x_\alpha - x_\alpha^\delta) \rangle$,

which together with inequalities (5) and (7) gives

$|\langle w + (Kx_\alpha - y^\dagger)/\alpha, K(x_\alpha - x_\alpha^\delta) \rangle| \le (\|w\| + \|Kx_\alpha - y^\dagger\|/\alpha)\|K(x_\alpha^\delta - x_\alpha)\| \le 6\|w\|\delta$.

This concludes the proof. □

Hence, the total error differs from the sum of approximation and data errors
only by a term of magnitude $\delta$. In general, the difference can be either positive
or negative and both cases are observed in numerical experiments.

We shall need the following result on the function $\alpha \mapsto \|Kx_\alpha^\delta - y^\delta\|$.

Lemma 2.5. The function $\alpha \mapsto \|Kx_\alpha^\delta - y^\delta\|$ is monotonically increasing and
uniformly bounded. Moreover, if $J_\alpha$ has a unique minimizer, then it is also
continuous at $\alpha$.

Proof. Let $\tilde{x}$ be an $R$-minimizing element in $X$. By the minimizing property of
$x_\alpha^\delta$, we have

$\frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|^2 + \alpha R(x_\alpha^\delta) \le \frac{1}{2}\|K\tilde{x} - y^\delta\|^2 + \alpha R(\tilde{x})$,

and thus $0 \le \|Kx_\alpha^\delta - y^\delta\| \le \|K\tilde{x} - y^\delta\| < +\infty$, so the function is uniformly bounded. The
proof of the remaining assertion can be found in [3, 28]. □



The last result in this section gives an estimate for the distance between two
regularized solutions for the same data but different regularization parameters.
This estimate underlies the quasi-optimality principle in Section 4.

Proposition 2.6. For $q \in ]0, 1[$ and $\xi_\alpha^\delta = -K^*(Kx_\alpha^\delta - y^\delta)/\alpha$ there holds

$D_{\xi_\alpha^\delta}(x_{q\alpha}^\delta, x_\alpha^\delta) \le \frac{(1-q)^2}{2q} \frac{\|Kx_\alpha^\delta - y^\delta\|^2}{\alpha}$. (12)

Moreover, if the source condition (3) is fulfilled, then

$\|K(x_{q\alpha}^\delta - x_\alpha^\delta)\| \le 2(1 - q)(\delta + 2\alpha\|w\|)$. (13)
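Proof. The minimizing property of $x_{q\alpha}^\delta$ implies

$\frac{1}{2}\|Kx_{q\alpha}^\delta - y^\delta\|^2 + q\alpha R(x_{q\alpha}^\delta) \le \frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|^2 + q\alpha R(x_\alpha^\delta)$.

Rearranging the terms gives

$\frac{1}{2}\|Kx_{q\alpha}^\delta - y^\delta\|^2 + q\alpha D_{\xi_\alpha^\delta}(x_{q\alpha}^\delta, x_\alpha^\delta) \le \frac{1}{2}\|Kx_\alpha^\delta - y^\delta\|^2 + q \langle Kx_\alpha^\delta - y^\delta, K(x_{q\alpha}^\delta - x_\alpha^\delta) \rangle$,

which leads to

$q\alpha D_{\xi_\alpha^\delta}(x_{q\alpha}^\delta, x_\alpha^\delta) \le -(1 - q)\langle Kx_\alpha^\delta - y^\delta, K(x_{q\alpha}^\delta - x_\alpha^\delta) \rangle - \frac{1}{2}\|K(x_{q\alpha}^\delta - x_\alpha^\delta)\|^2$.

Appealing again to Cauchy-Schwarz and Young's inequalities gives (12). Using
the Cauchy-Schwarz inequality in

$\frac{1}{2}\|K(x_{q\alpha}^\delta - x_\alpha^\delta)\|^2 \le -(1 - q)\langle Kx_\alpha^\delta - y^\delta, K(x_{q\alpha}^\delta - x_\alpha^\delta) \rangle$,

and noting estimate (11) shows the remaining assertion. □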



3 A parameter choice à la Hanke-Raus

In this section, we investigate a first heuristic parameter choice rule based on
an error estimate, which resembles a rule due to Hanke and Raus [22]. Although
it is known that heuristic rules can never lead to regularization methods in the
context of the classical worst-case scenario unless the problem is well-posed [1],
they have proven applicable and useful in practice [23]. Recent results [53] show
that weak assumptions on the true data $y^\dagger$ as well as the noisy data $y^\delta$, hence
leaving the worst-case scenario analysis, lead to provable error estimates. We
shall establish a posteriori error estimates as well as convergence for the rule.



3.1 Motivation 

We see from Proposition 2.2 that the estimate for the total error differs from
that for the squared residual by a factor of $1/\alpha$:

$\frac{\|Kx_\alpha^\delta - y^\delta\|^2}{\alpha} \le \frac{(\delta + 2\alpha\|w\|)^2}{\alpha} = \left(\frac{\delta}{\sqrt{\alpha}} + 2\sqrt{\alpha}\|w\|\right)^2$,

$\frac{1}{2}\left(\frac{\delta}{\sqrt{\alpha}} + \sqrt{\alpha}\|w\|\right)^2 \ge D(x_\alpha^\delta, x^\dagger)$.

Since the value $\|Kx_\alpha^\delta - y^\delta\|^2/\alpha$ can be evaluated a posteriori without resorting
to any knowledge of the exact noise level $\delta$, we propose to use it as an estimate
of the total error and to choose an appropriate regularization parameter $\alpha$ by
minimizing the function

$\phi(\alpha) = \frac{\|Kx_\alpha^\delta - y^\delta\|^2}{\alpha}$. (14)

This resembles the parameter choice due to Hanke and Raus [22] for classical
Tikhonov regularization as well as several iterative regularization methods.

In view of Lemma 2.5 we see that $\lim_{\alpha \to +\infty} \phi(\alpha) = 0$. Similarly, in case of a
unique minimizer to the functional $J_\alpha$ for any $\alpha > 0$, the optimization problem
of minimizing $\phi$ over any bounded and closed interval of the positive semi-axis
$\mathbb{R}^+$ is well-defined.
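The rule can be sketched on a toy $\ell^1$ problem in which the minimizer is given by componentwise soft thresholding. The sketch assumes an operator that maps $x \in \mathbb{R}^4$ to $(s_1 x_1, \ldots, s_4 x_4, 0, 0) \in \mathbb{R}^6$, so that the last two data components lie outside the range of $K$, and it searches a geometric grid in $]0, \|K\|^2]$ instead of minimizing $\phi$ continuously. The data, the grid and all function names are hypothetical choices for illustration and not the authors' implementation.

```python
import math

def l1_minimizer(s, y, alpha):
    """Minimizer of 1/2*||Kx - y||^2 + alpha*||x||_1 for K x = (s_1 x_1, ..., s_n x_n, 0, ..., 0)."""
    x = []
    for si, yi in zip(s, y):   # zip stops after the n data components that K can reach
        t = si * yi
        x.append(math.copysign(max(abs(t) - alpha, 0.0), t) / (si * si))
    return x

def residual_sq(s, y, x):
    """Squared residual ||Kx - y||^2, including the data components outside the range of K."""
    in_range = sum((si * xi - yi) ** 2 for si, xi, yi in zip(s, x, y))
    out_of_range = sum(yj ** 2 for yj in y[len(s):])
    return in_range + out_of_range

def hanke_raus_functional(s, y, alpha):
    """phi(alpha) = ||K x_alpha^delta - y^delta||^2 / alpha."""
    return residual_sq(s, y, l1_minimizer(s, y, alpha)) / alpha

s = [1.0, 0.8, 0.5, 0.2]                             # assumed singular values, ||K||^2 = 1
y_noisy = [2.01, -0.02, -0.485, 0.01, 0.02, -0.01]   # assumed noisy data in R^6

grid = [0.5 ** k for k in range(27)]                 # geometric grid in ]0, ||K||^2]
alpha_star = min(grid, key=lambda a: hanke_raus_functional(s, y_noisy, a))

# Discrepancy at the selected parameter, to be monitored a posteriori.
delta_star = math.sqrt(residual_sq(s, y_noisy, l1_minimizer(s, y_noisy, alpha_star)))
```

Because part of the noise lies outside the range of $K$, the residual does not vanish as $\alpha \to 0$, and $\phi$ grows without bound there; this is what keeps the selected parameter away from the smallest grid value in this example.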
3.2 A posteriori error estimates

In this part, we derive a posteriori error estimates for the Hanke-Raus rule
to offer partial theoretical justification. We shall treat the case of uniformly
convex $R$ and the particular case $R(x) = \|x\|_{\ell^1}$ separately.

Theorem 3.1. Let the source condition (3) be fulfilled. Let $\phi$ be defined by (14)
and $\alpha^*$ defined as

$\alpha^* \in \arg\min_{\alpha \in [0, \|K\|^2]} \phi(\alpha)$. (15)

If furthermore $\delta^* := \|Kx_{\alpha^*}^\delta - y^\delta\| \ne 0$, then there exists a constant $C > 0$ such
that

$D(x_{\alpha^*}^\delta, x^\dagger) \le C\left(1 + \left(\frac{\delta}{\delta^*}\right)^2\right) \max(\delta, \delta^*)$.

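Proof. We have from Corollary 2.4

$D(x_{\alpha^*}^\delta, x^\dagger) \le D(x_{\alpha^*}, x^\dagger) + D(x_{\alpha^*}^\delta, x_{\alpha^*}) + 6\|w\|\delta$.

It suffices to estimate the two Bregman distance terms. First we estimate the
approximation error $D(x_{\alpha^*}, x^\dagger)$. By inequalities (8) and (7), we
obtain

$D(x_{\alpha^*}, x^\dagger) \le \|w\| \|Kx_{\alpha^*} - y^\dagger\|$
$\le \|w\| (\|K(x_{\alpha^*} - x_{\alpha^*}^\delta)\| + \|Kx_{\alpha^*}^\delta - y^\delta\| + \delta)$
$\le \|w\| (2\delta + \delta^* + \delta) \le 4\|w\| \max(\delta, \delta^*)$.

Next we estimate the data error $D(x_{\alpha^*}^\delta, x_{\alpha^*})$. Using inequality (6), we get

$D(x_{\alpha^*}^\delta, x_{\alpha^*}) \le \frac{\delta^2}{2\alpha^*} = \left(\frac{\delta}{\delta^*}\right)^2 \frac{\|Kx_{\alpha^*}^\delta - y^\delta\|^2}{2\alpha^*}$. (16)

By the definition of $\alpha^*$, we only increase the right hand side if we replace $\alpha^*$ by
any other $\alpha \in [0, \|K\|^2]$. We use $\bar{\alpha} = c\delta$ with $c = \min(1, \delta^{-1})\|K\|^2$ and deduce
from inequality (11) that

$\|Kx_{\bar{\alpha}}^\delta - y^\delta\| \le (1 + 2c\|w\|)\delta$.

Replacing $\alpha^*$ by $\bar{\alpha}$ in inequality (16), we have

$D(x_{\alpha^*}^\delta, x_{\alpha^*}) \le \left(\frac{\delta}{\delta^*}\right)^2 \frac{\|Kx_{\bar{\alpha}}^\delta - y^\delta\|^2}{2\bar{\alpha}} \le \left(\frac{\delta}{\delta^*}\right)^2 \frac{(1 + 2c\|w\|)^2}{2c} \delta$.

By combining the above two estimates, we finally arrive at

$D(x_{\alpha^*}^\delta, x^\dagger) \le 4\|w\| \max(\delta, \delta^*) + \left(\frac{\delta}{\delta^*}\right)^2 \frac{(1 + 2c\|w\|)^2}{2c} \delta + 6\|w\|\delta$
$\le C\left(1 + \left(\frac{\delta}{\delta^*}\right)^2\right) \max(\delta, \delta^*)$

with $C = \max(10\|w\|, (1 + 2c\|w\|)^2/(2c))$ as desired. □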

The preceding result estimates the error in terms of the Bregman distance.
In the case of $p$-convex regularization terms $R$ (see e.g., [1]), this also provides
error estimates in norm, i.e., for $\|x_{\alpha^*}^\delta - x^\dagger\|$. However, the interesting case of $\ell^1$
regularization, i.e., $X = \ell^2$ and $R(x) = \|x\|_{\ell^1} = \sum_k |x_k|$, is not covered. In this
case the Bregman distance is not even positive definite, i.e., $D(x', x)$ may vanish
for distinct $x'$ and $x$. However, by using techniques from [30, 20], we are still
able to prove an analogous error estimate for this case. To this end, we recall
the following result [20].

Lemma 3.2. Let $X = \ell^2$ and $R(x) = \sum_k |x_k|$. Assume that the solution $x^\dagger$
is finitely supported and satisfies the source condition (3). Moreover, assume
that the operator $K$ satisfies the finite basis injectivity property, that is, for any
finitely supported $u$ and $v$, there holds that $Ku = Kv$ implies $u = v$. Then there
exist two positive constants $c_1$ and $c_2$ such that

$\|x - x^\dagger\|_{\ell^1} \le c_1 [R(x) - R(x^\dagger)] + c_2 \|K(x - x^\dagger)\|$.



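We are now ready to transfer Theorem 3.1 to the case $R(x) = \|x\|_{\ell^1}$.

Theorem 3.3. Assume that the conditions in Lemma 3.2 are satisfied. Let $\alpha^*$
be chosen according to (15). If furthermore $\delta^* := \|Kx_{\alpha^*}^\delta - y^\delta\| \ne 0$, then there
exists a constant $C > 0$ such that

$\|x_{\alpha^*}^\delta - x^\dagger\|_{\ell^1} \le C\left(1 + \left(\frac{\delta}{\delta^*}\right)^2\right) \max(\delta, \delta^*)$.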

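Proof. By Lemma 3.2, the definition of the Bregman distance $D(x, x^\dagger)$ and the
source condition (3), we have

$\|x - x^\dagger\|_{\ell^1} \le c_1 [R(x) - R(x^\dagger)] + c_2 \|K(x - x^\dagger)\|$
$= c_1 D(x, x^\dagger) + c_1 \langle \xi, x - x^\dagger \rangle + c_2 \|K(x - x^\dagger)\|$
$= c_1 D(x, x^\dagger) + c_1 \langle w, K(x - x^\dagger) \rangle + c_2 \|K(x - x^\dagger)\|$
$\le c_1 D(x, x^\dagger) + (c_1 \|w\| + c_2) \|K(x - x^\dagger)\|$

by the Cauchy-Schwarz inequality. Now by virtue of Corollary 2.4 we have

$\|x_{\alpha^*}^\delta - x^\dagger\|_{\ell^1} \le c_1 (D(x_{\alpha^*}^\delta, x_{\alpha^*}) + D(x_{\alpha^*}, x^\dagger) + 6\|w\|\delta) + (c_1 \|w\| + c_2) \|K(x_{\alpha^*}^\delta - x^\dagger)\|$.

Next we bound each term on the right hand side. First observe

$\|K(x_{\alpha^*}^\delta - x^\dagger)\| \le \|Kx_{\alpha^*}^\delta - y^\delta\| + \|y^\delta - Kx^\dagger\| \le \delta^* + \delta \le 2\max(\delta, \delta^*)$.

Then, for the approximation error $D(x_{\alpha^*}, x^\dagger)$, we obtain as before

$D(x_{\alpha^*}, x^\dagger) \le \|w\| \|Kx_{\alpha^*} - y^\dagger\|$
$\le \|w\| (\|K(x_{\alpha^*} - x_{\alpha^*}^\delta)\| + \|Kx_{\alpha^*}^\delta - y^\delta\| + \delta)$
$\le \|w\| (2\delta + \delta^* + \delta) \le 4\|w\| \max(\delta, \delta^*)$.

Finally, for the data error $D(x_{\alpha^*}^\delta, x_{\alpha^*})$, we obtain from inequality (6) and the
definition of $\alpha^*$

$D(x_{\alpha^*}^\delta, x_{\alpha^*}) \le \frac{\delta^2}{2\alpha^*} = \left(\frac{\delta}{\delta^*}\right)^2 \frac{\|Kx_{\alpha^*}^\delta - y^\delta\|^2}{2\alpha^*}$.

By the minimizing property of $\alpha^*$, replacing $\alpha^*$ by any other $\alpha \in [0, \|K\|^2]$ only
increases the right hand side. Setting $\bar{\alpha} = c_3 \delta \in [0, \|K\|^2]$, then $\|Kx_{\bar{\alpha}}^\delta - y^\delta\|^2 \le
c_4 \delta^2$ (cf. US]), which consequently gives

$D(x_{\alpha^*}^\delta, x_{\alpha^*}) \le \left(\frac{\delta}{\delta^*}\right)^2 \frac{c_4}{2c_3} \delta \le \left(\frac{\delta}{\delta^*}\right)^2 \frac{c_4}{2c_3} \max(\delta, \delta^*)$.

Combining these three estimates we arrive at the desired inequality with $C =
\max(12c_1\|w\| + 2c_2, c_1 c_4/(2c_3))$. □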

As long as the discrepancy $\delta^*$ is of order $\delta$, Theorems 3.1 and 3.3 imply
that the approximation $x_{\alpha^*}^\delta$ with $\alpha^*$ chosen by the rule (14) converges to the
exact solution $x^\dagger$ at the same rate as a priori parameter choice rules under
identical source conditions [20]. On the other hand, if $\delta^*$ does not decrease as
quickly as $\delta$, then the convergence would be suboptimal. More dangerous is the
case that $\delta^*$ decreases more quickly. Then the prefactor $\delta/\delta^*$ blows up, and
the approximation may diverge. Therefore, the value of $\delta^*$ should always be
monitored as an a posteriori criterion: The computed approximation should be
discarded if $\delta^*$ is deemed too small.

3.3 Convergence 

By stipulating additional conditions on the data $y^\delta$ as in reference [22], however,
we can get rid of the prefactor in the estimates and even obtain convergence
of the method. To show this, we denote by $Q$ the orthogonal projection onto
the orthogonal complement of the closure of range $K$.

Corollary 3.4. If for the noisy data $y^\delta$, there exists some $\epsilon > 0$ such that

$\|Q(y^\dagger - y^\delta)\| \ge \epsilon \|y^\dagger - y^\delta\|$,

then $\alpha^*$ according to (15) is positive. Moreover, under the conditions of
Theorem 3.1 there holds

$D(x_{\alpha^*}^\delta, x^\dagger) \le C\left(1 + \frac{1}{\epsilon^2}\right) \max(\delta, \delta^*)$,

and under the conditions of Theorem 3.3 there holds

$\|x_{\alpha^*}^\delta - x^\dagger\|_{\ell^1} \le C\left(1 + \frac{1}{\epsilon^2}\right) \max(\delta, \delta^*)$.

Proof. We observe

$\|Kx_\alpha^\delta - y^\delta\| \ge \|Q(Kx_\alpha^\delta - y^\delta)\| = \|Qy^\delta\| = \|Q(y^\delta - y^\dagger)\| \ge \epsilon \|y^\delta - y^\dagger\|$. (18)

This shows $\delta^* \ge \epsilon\delta$ and especially that $\phi(\alpha) \to +\infty$ as $\alpha \to 0$. Consequently,
there exists a positive $\alpha^*$ minimizing $\phi(\alpha)$ over $[0, \|K\|^2]$. The remaining
assertion follows from the preceding estimate and the respective error estimate. □

The next theorem shows the convergence of the rule under the condition
that $\|Q(y^\dagger - y^\delta)\| \ge \epsilon \|y^\dagger - y^\delta\|$ holds uniformly for the data $y^\delta$ as $\delta$ tends to
zero.
Theorem 3.5. Assume that the functional $J_\alpha$ is coercive and has a unique
minimizer. Furthermore, in the situation of Theorem 3.1 let the assumption of
Corollary 3.4 be fulfilled uniformly, i.e., there exists an $\epsilon > 0$ such that for every
$\delta > 0$, the following inequality holds

$\|Q(y^\dagger - y^\delta)\| \ge \epsilon \|y^\dagger - y^\delta\|$. (19)

Then there holds

$D(x_{\alpha^*(y^\delta)}^\delta, x^\dagger) \to 0$ for $\delta \to 0$.

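Proof. By the definition of $\alpha^*$, we observe that the sequence $(\alpha^* = \alpha^*(y^\delta))_{\delta > 0}$
is uniformly bounded and hence, there exists an accumulation point $\bar{\alpha}$. We
distinguish the two cases $\bar{\alpha} = 0$ and $\bar{\alpha} > 0$.

We first consider the case $\bar{\alpha} = 0$. By Corollary 2.4 we split the error

$D(x_{\alpha^*}^\delta, x^\dagger) \le D(x_{\alpha^*}^\delta, x_{\alpha^*}) + D(x_{\alpha^*}, x^\dagger) + 6\|w\|\delta$, (20)

and estimate the data and approximation errors separately.

For the data error $D(x_{\alpha^*}^\delta, x_{\alpha^*})$, we deduce from inequality (18) and
assumption (19) that

$D(x_{\alpha^*}^\delta, x_{\alpha^*}) \le \frac{\delta^2}{2\alpha^*} \le \frac{\|Kx_{\alpha^*}^\delta - y^\delta\|^2}{2\epsilon^2 \alpha^*} = \frac{\phi(\alpha^*)}{2\epsilon^2}$.

Therefore, it suffices to show that $\phi(\alpha^*)$ goes to zero as $\delta \to 0$. By Proposition
2.2 there holds for every $\alpha \in [0, \|K\|^2]$ that

$\phi(\alpha^*) \le \phi(\alpha) \le \left(\frac{\delta}{\sqrt{\alpha}} + 2\|w\|\sqrt{\alpha}\right)^2$.

Hence, we may choose $\alpha(\delta)$ in the usual way such that $\alpha(\delta) \to 0$ and $\delta^2/\alpha(\delta) \to 0$
for $\delta \to 0$. This shows $\phi(\alpha^*) \to 0$ for $\delta \to 0$.

For the approximation error $D(x_{\alpha^*}, x^\dagger)$, we deduce from the fact that $\bar{\alpha} = 0$
and estimate (4) that

$D(x_{\alpha^*}, x^\dagger) \le \frac{\alpha^* \|w\|^2}{2} \to \frac{\bar{\alpha} \|w\|^2}{2} = 0$ for $\delta \to 0$.

Hence, all three terms on the right hand side of inequality (20) tend to zero for
$\delta \to 0$ as desired.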

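Next we consider the remaining case $\bar{\alpha} > 0$. We use $\alpha^* \le \|K\|^2$ to get

$\phi(\alpha^*) \ge \frac{\|Kx_{\alpha^*}^\delta - y^\delta\|^2}{\|K\|^2} \ge 0$.

Since $\phi(\alpha^*)$ goes to zero for $\delta \to 0$ we deduce that $\|Kx_{\alpha^*}^\delta - y^\delta\|$ tends to zero
as well. Next by the minimizing property of $x_{\alpha^*}^\delta$, we have

$\frac{1}{2}\|Kx_{\alpha^*}^\delta - y^\delta\|^2 + \alpha^* R(x_{\alpha^*}^\delta) \le \frac{1}{2}\|Kx^\dagger - y^\delta\|^2 + \alpha^* R(x^\dagger)$.

Therefore, both sequences $(\|Kx_{\alpha^*}^\delta - y^\delta\|)_\delta$ and $(R(x_{\alpha^*}^\delta))_\delta$ are uniformly bounded
by the assumption $\bar{\alpha} > 0$. By the coercivity of the functional $J_\alpha$, the sequence
$(x_{\alpha^*}^\delta)_\delta$ is uniformly bounded, and thus there exists a subsequence, possibly
after relabeling as $(x_{\alpha^*}^\delta)_\delta$, that converges weakly to some $\bar{x}$. By the weak lower
semicontinuity of the norm and the functional $R$, we have

$\|K\bar{x} - y^\dagger\| \le \liminf_{\delta \to 0} \|Kx_{\alpha^*}^\delta - y^\delta\| = 0$, $R(\bar{x}) \le \liminf_{\delta \to 0} R(x_{\alpha^*}^\delta)$. (21)

Consequently, for any $x$

$\frac{1}{2}\|K\bar{x} - y^\dagger\|^2 + \bar{\alpha} R(\bar{x}) \le \liminf_{\delta \to 0} \frac{1}{2}\|Kx_{\alpha^*}^\delta - y^\delta\|^2 + \liminf_{\delta \to 0} \alpha^* R(x_{\alpha^*}^\delta)$
$\le \liminf_{\delta \to 0} \{ \frac{1}{2}\|Kx_{\alpha^*}^\delta - y^\delta\|^2 + \alpha^* R(x_{\alpha^*}^\delta) \}$
$\le \liminf_{\delta \to 0} \{ \frac{1}{2}\|Kx - y^\delta\|^2 + \alpha^* R(x) \}$
$= \frac{1}{2}\|Kx - y^\dagger\|^2 + \bar{\alpha} R(x)$. (22)

Hence $\bar{x}$ is a minimizer of the functional $J_{\bar{\alpha}}$, and by the uniqueness of the
minimizer, $\bar{x} = x_{\bar{\alpha}}$. Since this holds for every subsequence, the whole sequence
converges weakly. Moreover, by the weak lower semicontinuity, we have

$R(x_{\alpha^*}^\delta) \to R(x_{\bar{\alpha}})$. (23)

Next we show that $x_{\bar{\alpha}}$ is an $R$-minimizing solution to the equation $Kx = y^\dagger$.
However, this follows directly from inequality (21), which gives $\|Kx_{\bar{\alpha}} - y^\dagger\| = 0$, and
from inequality (22)

$R(x_{\bar{\alpha}}) \le R(x) \quad \forall x$,

which in particular by choosing $x$ in the set of $R$-minimizing solutions shows
the claim. Now we deduce that

$\lim_{\delta \to 0} D(x_{\alpha^*}^\delta, x^\dagger) = \lim_{\delta \to 0} (R(x_{\alpha^*}^\delta) - R(x^\dagger) - \langle \xi, x_{\alpha^*(y^\delta)}^\delta - x^\dagger \rangle) = 0$

by observing identity (23) and the weak convergence of the sequence $(x_{\alpha^*(y^\delta)}^\delta)_\delta$
to $x^\dagger$. This concludes the proof of the theorem. □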



Remark 3.6. In Theorem 3.5, the uniqueness assumption on the functional $J_\alpha$
can be relaxed as equation (23) holds for each weakly convergent subsequence.
We have utilized the uniqueness of the $R$-minimizing solution, which may also be
dropped by restating the result as: then there exists some $R$-minimizing solution
$x^\dagger$, such that

$D(x_{\alpha^*}^\delta, x^\dagger) \to 0$ for $\delta \to 0$.

In our context we are able to further weaken the assumption (19) on the
noise.



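Corollary 3.7. If there exists an $\epsilon \in ]0, 1[$ such that for all $z \in K(\operatorname{dom} \partial R)$ the
following inequality holds

$\langle y^\delta - y^\dagger, z \rangle \le (1 - \epsilon) \|y^\delta - y^\dagger\| \|z\|$,

then the minimizer $\alpha^*$ of $\phi(\alpha)$ is positive. Moreover, under the conditions of
Theorem 3.1 there holds

$D(x_{\alpha^*}^\delta, x^\dagger) \le C\left(1 + \frac{1}{1 - (1 - \epsilon)^2}\right) \max(\delta, \delta^*)$,

and under the conditions of Theorem 3.3 there holds

$\|x_{\alpha^*}^\delta - x^\dagger\|_{\ell^1} \le C\left(1 + \frac{1}{1 - (1 - \epsilon)^2}\right) \max(\delta, \delta^*)$.

Proof. By observing the fact that both $x_\alpha^\delta$ and $x^\dagger$ are in $\operatorname{dom} \partial R$ and the
assumption on the noise $y^\dagger - y^\delta$, we derive

$\|Kx_\alpha^\delta - y^\delta\|^2 = \|K(x_\alpha^\delta - x^\dagger) - (y^\delta - y^\dagger)\|^2$
$= \|K(x_\alpha^\delta - x^\dagger)\|^2 - 2\langle K(x_\alpha^\delta - x^\dagger), y^\delta - y^\dagger \rangle + \|y^\delta - y^\dagger\|^2$
$\ge \|K(x_\alpha^\delta - x^\dagger)\|^2 - 2(1 - \epsilon)\|K(x_\alpha^\delta - x^\dagger)\| \|y^\delta - y^\dagger\| + \|y^\delta - y^\dagger\|^2$
$= (\|K(x_\alpha^\delta - x^\dagger)\| - (1 - \epsilon)\|y^\delta - y^\dagger\|)^2 + (1 - (1 - \epsilon)^2)\|y^\delta - y^\dagger\|^2$
$\ge (1 - (1 - \epsilon)^2)\delta^2$.

This in particular implies $(\delta^*)^2 \ge (1 - (1 - \epsilon)^2)\delta^2$ and consequently that $\phi(\alpha) \to
+\infty$ as $\alpha \to 0$. Therefore, there exists a positive $\alpha^*$ minimizing $\phi(\alpha)$ over
$[0, \|K\|^2]$. The remaining assertion follows similarly to the proof of Theorem
3.5. □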



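Remark 3.8 (Comparing the assumptions on the noise). In Corollary 3.4 and
Theorem 3.5 we assumed

$\exists \epsilon \in ]0, 1[ \ \forall \delta > 0 : \|Q(y^\dagger - y^\delta)\| \ge \epsilon \|y^\dagger - y^\delta\|$,

which is, with $P$ denoting the orthogonal projector onto $\overline{\operatorname{range} K}$, equivalent to

$\exists \epsilon' \in ]0, 1[ \ \forall \delta > 0 : \|P(y^\dagger - y^\delta)\| \le \epsilon' \|y^\dagger - y^\delta\|$. (24)

In Corollary 3.7 we assumed

$\exists \epsilon'' \in ]0, 1[ \ \forall \delta > 0 \ \forall z \in K(\operatorname{dom} \partial R) : \langle y^\delta - y^\dagger, z \rangle \le \epsilon'' \|y^\delta - y^\dagger\| \|z\|$. (25)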



In the case dom dR = X we conclude from this assumption that ( 24 ) holds with 
e' = yl — e" . Hence, in this case (25) implies (24). However, if domdR is 
strictly contained in X condition ( 25 ) may be considerably weaker. 



4 The quasi-optimality principle 

In this part, we derive another error-estimate-based heuristic choice rule, i.e.,
the quasi-optimality principle, and discuss its convergence properties. The
motivation of the principle is as follows: By Proposition 2.6 for any $q \in ]0, 1[$, there
holds

$D(x_{q\alpha}^\delta, x_\alpha^\delta) \le \frac{(1 - q)^2}{2q} \phi(\alpha)$.

In particular, for a geometrically decreasing sequence of regularization
parameters, the Bregman distances of two consecutive regularized solutions are bounded
from above by a constant times the estimator $\phi$. This suggests a parameter
choice rule which resembles the classical quasi-optimality criterion OH].
More precisely, for given data $y^\delta$ and $q \in ]0, 1[$ we define a quasi-optimality
sequence as

$\mu_k^\delta = D(x_{q^k}^\delta, x_{q^{k-1}}^\delta)$.

The quasi-optimality principle consists of choosing the regularization parameter
$\alpha_{qo} = q^{k^*}$ such that $\mu_{k^*}^\delta$ is minimal over a given range $k \ge k_0$.
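The discrete principle can be sketched as follows. The example assumes a diagonal operator on $\mathbb{R}^8$ and the quadratic functional $R(x) = \frac{1}{2}\|x\|^2$, for which the Bregman distance is $\frac{1}{2}\|x' - x\|^2$ and the minimizer has a closed form; the data, the value of $q$, the index range and the function names are hypothetical illustration choices. In practice the index range has to be bounded, since which index is selected depends on both the data and that range.

```python
def tikhonov_diag(s, y, alpha):
    """Minimizer of 1/2*||Kx - y||^2 + alpha/2*||x||^2 for K = diag(s)."""
    return [si * yi / (si * si + alpha) for si, yi in zip(s, y)]

def bregman_quadratic(x_prime, x):
    """Bregman distance for R(x) = 1/2*||x||^2, i.e. 1/2*||x' - x||^2."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x_prime, x))

def quasi_optimality(s, y, q, k0, k_max):
    """Return (alpha_qo, k_star, mu) with mu[k] = D(x_{q^k}, x_{q^(k-1)}) for k0 <= k <= k_max."""
    mu = {}
    x_prev = tikhonov_diag(s, y, q ** (k0 - 1))
    for k in range(k0, k_max + 1):
        x_k = tikhonov_diag(s, y, q ** k)
        mu[k] = bregman_quadratic(x_k, x_prev)
        x_prev = x_k
    k_star = min(mu, key=mu.get)   # index with the smallest quasi-optimality value
    return q ** k_star, k_star, mu

s = [10.0 ** (-i / 2) for i in range(8)]                 # assumed decaying singular values
x_true = [1.0, -1.0, 0.5, 0.5, -0.25, 0.25, 0.1, -0.1]   # assumed exact solution
noise = [1e-3 * (-1) ** i for i in range(8)]             # assumed noise vector
y_noisy = [si * xi + ni for si, xi, ni in zip(s, x_true, noise)]

alpha_qo, k_star, mu = quasi_optimality(s, y_noisy, q=0.5, k0=1, k_max=20)
```

Each regularized solution is computed once and reused for the next difference, so the cost is one minimization per grid point.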
Remark 4.1. The classical quasi-optimality principle as e.g., stated in \35Y \3$,
chooses $\alpha_{qo}$ such that the quantity $\|\alpha \frac{d x_\alpha^\delta}{d\alpha}\|$ is minimal. In our setting this
approach seems not applicable since the mapping $\alpha \mapsto x_\alpha^\delta$ is in general not
differentiable. For instance, in the case of $\ell^1$ regularization, the solution path,
i.e., $x_\alpha^\delta$ with respect to $\alpha$, is piecewise linear [16]. Hence we resort to the discrete
version which is also used in J1J/.

We shall follow closely the lines of reference [15] and start with some basic
observations on the quasi-optimality sequence. The quasi-optimality sequence
for the exact data will be denoted by

$\mu_k = D(x_{q^k}, x_{q^{k-1}})$.



Lemma 4.2. Let the source condition (3) be satisfied. Then the quasi-optimality
sequences $(\mu_k^\delta)_k$ and $(\mu_k)_k$ fulfill

1. $\lim_{k \to -\infty} \mu_k^\delta = 0$,

2. $\lim_{k \to -\infty} \mu_k = 0$ and $\lim_{k \to +\infty} \mu_k = 0$.

Proof. Appealing to estimate (12), we have

$\mu_k^\delta \le \frac{(1 - q)^2}{2q} \frac{\|Kx_{q^{k-1}}^\delta - y^\delta\|^2}{q^{k-1}}$.

Since the sequence $(\|Kx_{q^{k-1}}^\delta - y^\delta\|)_k$ stays bounded for $k \to -\infty$, see Lemma
2.5, the first claim follows directly. Setting $\delta = 0$ in the above argumentation
shows the first statement of Claim 2. Now we use inequality (5) and estimate

$\mu_k \le \frac{(1 - q)^2}{2q} \frac{\|Kx_{q^{k-1}} - y^\dagger\|^2}{q^{k-1}} \le 2(1 - q)^2 \|w\|^2 q^{k-2}$.

This shows the second statement of Claim 2. □

Now we show that the quasi-optimality sequences for exact and noisy data 
approximate each other for vanishing noise level. 

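Lemma 4.3. Let the source condition (3) be satisfied. Then for any $k_1 \in \mathbb{Z}$,
there holds

$\lim_{\delta \to 0} \sup_{k \le k_1} |\mu_k^\delta - \mu_k| = 0$.

Proof. We will use the abbreviations $x_k^\delta = x_{q^k}^\delta$, $x_k = x_{q^k}$, $\xi_k^\delta = -K^*(Kx_k^\delta - y^\delta)/q^k$
and $\xi_k = -K^*(Kx_k - y^\dagger)/q^k$ to simplify the notation. By the definition of $\mu_k^\delta$ and
$\mu_k$, we have

$|\mu_k^\delta - \mu_k| = |D(x_k^\delta, x_{k-1}^\delta) - D(x_k, x_{k-1})|$
$= |R(x_k^\delta) - R(x_{k-1}^\delta) - \langle \xi_{k-1}^\delta, x_k^\delta - x_{k-1}^\delta \rangle - R(x_k) + R(x_{k-1}) + \langle \xi_{k-1}, x_k - x_{k-1} \rangle|$
$= |D(x_k^\delta, x_k) - D(x_{k-1}^\delta, x_{k-1}) + \langle \xi_{k-1} - \xi_{k-1}^\delta, x_k^\delta - x_{k-1}^\delta \rangle + \langle \xi_k - \xi_{k-1}, x_k^\delta - x_k \rangle|$.

Now we estimate all four terms separately. Using inequality (6) we can bound
the first two terms by

$D(x_k^\delta, x_k) \le \frac{\delta^2}{2q^k}$, $D(x_{k-1}^\delta, x_{k-1}) \le \frac{\delta^2}{2q^{k-1}}$.

For the third term, we get from estimates (7) and (13)

$\langle \xi_{k-1} - \xi_{k-1}^\delta, x_k^\delta - x_{k-1}^\delta \rangle = \langle K(x_{k-1}^\delta - x_{k-1}) - (y^\delta - y^\dagger), K(x_k^\delta - x_{k-1}^\delta) \rangle / q^{k-1}$
$\le (\|K(x_{k-1}^\delta - x_{k-1})\| + \delta) \|K(x_k^\delta - x_{k-1}^\delta)\| / q^{k-1}$
$\le 6\delta(1 - q)(\delta + 2q^{k-1}\|w\|)/q^{k-1}$.

Similarly, we can estimate the last term by

$\langle \xi_k - \xi_{k-1}, x_k^\delta - x_k \rangle \le 8\delta(1 - q)\|w\|/q$.

Hence, all four terms are bounded for $k \le k_1$ and decrease to zero as $\delta \to 0$.
This proves the claim. □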

In general, the quasi-optimality sequences $(\mu_k^\delta)_k$ and $(\mu_k)_k$ can vanish for
finite indices $k$. Fortunately, their positivity can be guaranteed for a class of
functionals $R$.

Lemma 4.4. Let the functional $R$ be $p$-convex, $R(x) = 0$ only for $x = 0$ and
satisfy that for any $x$ the value $\langle \xi, x \rangle$ is independent of the choice of $\xi \in \partial R(x)$.
If the data $y^\dagger$ (resp. $y^\delta$) admits nonzero $\alpha^*$ for which $x_{\alpha^*} \ne 0$, then $\mu_k > 0$
(resp. $\mu_k^\delta > 0$) for all $k > [\ln \alpha^* / \ln q]$.

Proof. By the optimality condition for $x_\alpha$, we have

$-K^*(Kx_\alpha - y^\dagger) \in \alpha \partial R(x_\alpha)$.

By assumption, the value $\langle \xi_\alpha, x_\alpha \rangle$ is independent of the choice of $\xi_\alpha \in \partial R(x_\alpha)$
and hence, taking duality pairing with $x_\alpha$ gives for any $\xi_\alpha \in \partial R(x_\alpha)$

$\langle Kx_\alpha, Kx_\alpha - y^\dagger \rangle + \alpha \langle \xi_\alpha, x_\alpha \rangle = 0$.

For non-zero $x_\alpha$ we have that $\langle \xi_\alpha, x_\alpha \rangle$ is non-zero and hence, we get

$\alpha = -\frac{\langle Kx_\alpha, Kx_\alpha - y^\dagger \rangle}{\langle \xi_\alpha, x_\alpha \rangle}$. (26)

Next, by the assumption that the data $y^\dagger$ admits nonzero $\alpha^*$ for which $x_{\alpha^*} \ne 0$,
for any $\alpha < \alpha^*$, $0$ cannot be a minimizer of the Tikhonov functional. To
see this, we assume that $0$ is a minimizer, i.e.,

$\frac{1}{2}\|y^\dagger\|^2 = \frac{1}{2}\|K0 - y^\dagger\|^2 + \alpha R(0) < \frac{1}{2}\|Kx_{\alpha^*} - y^\dagger\|^2 + \alpha^* R(x_{\alpha^*})$,

by the strict positivity of $R$ for nonzero $x$. This contradicts the minimality of
$x_{\alpha^*}$. Now let $\alpha_1, \alpha_2 < \alpha^*$ be distinct. Then both sets $\{x_{\alpha_1}\}$ and $\{x_{\alpha_2}\}$ contain
no zero element. Next we show that the two sets are disjoint. Assume that
$\{x_{\alpha_1}\}$ and $\{x_{\alpha_2}\}$ intersect nontrivially, i.e., there exists some nonzero $x$ such
that $x \in \{x_{\alpha_1}\} \cap \{x_{\alpha_2}\}$. Then by equation (26) and choosing any $\xi \in \partial R(x)$,
we have

$\alpha_1 = -\frac{\langle Kx, Kx - y^\dagger \rangle}{\langle \xi, x \rangle} = \alpha_2$,

which is in contradiction with the distinctness of $\alpha_1$ and $\alpha_2$. Therefore, for
distinct $\alpha_1, \alpha_2 < \alpha^*$, the sets $\{x_{\alpha_1}\}$ and $\{x_{\alpha_2}\}$ are disjoint. Consequently, we
have

$\|x_{\alpha_1} - x_{\alpha_2}\| > 0$.

Now by the $p$-convexity of $R$, we deduce for $q^k < \alpha^*$ that

$\mu_k = D(x_{q^k}, x_{q^{k-1}}) \ge C\|x_{q^k} - x_{q^{k-1}}\|^p > 0$,

which shows the assertion for $\mu_k$. The claim for $\mu_k^\delta$ can be shown similarly. □

Remark 4.5. The assumptions on $R$ in Lemma 4.4 are satisfied for many
commonly used regularization functionals, e.g., $\|x\|_{\ell^p}^p$,
$\|x\|_{L^p}^p$ with $p > 1$ and the elastic-net functional [27]. However, the
special case of $\|x\|_{\ell^1}$ is not covered. Indeed, $\ell^1$ minimization
can retrieve the support of the exact solution for sufficiently small noise
level $\delta$ and $\alpha$, see [36]. Consequently, both $\mu_k$ and
$\mu_k^\delta$ vanish for sufficiently large $k$, due to the lack of
$p$-convexity. The bound $\alpha^*$ depends on $y^\dagger$ ($y^\delta$), and
for nonvanishing $y^\dagger$ ($y^\delta$) can be either positive or $+\infty$,
see [12] for some discussions. The choice of $k_0$ should be related to
$\alpha^*$ such that $\mu_{k_0}$ ($\mu_{k_0}^\delta$) is nonzero.
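The vanishing of $\mu_k$ for the $\ell^1$ penalty can be made concrete in a toy
setting. The following sketch assumes $K = I$ (an assumption made only for
illustration, so that the minimizer is given by soft thresholding), takes
$\alpha = q^k$, and uses the subgradient $\xi_\alpha = (y - x_\alpha)/\alpha$
supplied by the optimality condition; the data vector `y` is hypothetical.

```python
import numpy as np

def soft(y, alpha):
    """Minimizer of 0.5*||x - y||^2 + alpha*||x||_1 (soft thresholding)."""
    return np.sign(y) * np.maximum(np.abs(y) - alpha, 0.0)

def mu(y, alpha_new, alpha_old):
    """Bregman distance D(x_new, x_old) for R = ||.||_1 and K = I, using the
    subgradient xi_old = (y - x_old) / alpha_old of R at x_old."""
    x_new, x_old = soft(y, alpha_new), soft(y, alpha_old)
    xi_old = (y - x_old) / alpha_old
    return (np.abs(x_new).sum() - np.abs(x_old).sum()
            - xi_old @ (x_new - x_old))

y = np.array([3.0, -1.5, 0.75, 0.0])   # hypothetical sparse data
q = 0.5
# mus[k] = D(x_{q^k}, x_{q^(k-1)}) for k = 0, ..., 7
mus = [mu(y, q**k, q**(k - 1)) for k in range(8)]
print(mus)
```

In this example the distances are positive while the support is still growing
and become zero once $q^{k-1}$ drops below the smallest nonzero entry of $|y|$,
since from then on the support and signs of consecutive minimizers coincide.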

By combining the above two lemmas, we have the following important corol- 
lary, which will play a key role in establishing the convergence result. 



Corollary 4.6. Under the conditions of Lemma 4.4, the parameter $\alpha_{qo}$
chosen by the quasi-optimality principle satisfies that for any sequence
$\delta_n \to 0$ there holds that $\alpha_{qo} \to 0$.

Proof. By definition it holds that $\alpha_{qo} = q^{k^*}$, where $k^*$ is an
index at which the sequence $(\mu_k^\delta)_k$ is minimal.

Observe that $\mu_k^\delta \le \mu_k + |\mu_k^\delta - \mu_k|$. Now, let
$\epsilon > 0$. Due to Lemma 4.2 there holds that $\mu_k \to 0$ for
$k \to \infty$ and hence, there exists an integer $\bar k$ such that
$\mu_{\bar k} \le \epsilon/2$. Moreover, due to Lemma 4.3, for any $k_1$ there
is $\delta > 0$ such that $|\mu_k^\delta - \mu_k| \le \epsilon/2$ for all
$k \le k_1$, in particular for $k = \bar k$. Hence
$\mu_{\bar k}^\delta \le \mu_{\bar k} + |\mu_{\bar k}^\delta - \mu_{\bar k}| \le \epsilon$
for the same value of $\bar k$.

By Lemma 4.4, for any finite integer $k_1$, the set $\{\mu_k\}_{k=k_0}^{k_1}$
is finite and positive, and thus there exists a constant $a > 0$ such that
$\mu_k \ge a$ for $k = k_0, \ldots, k_1$. Lemma 4.3 indicates that
$\mu_k^\delta$ is larger than $a/2$ for $k = k_0, \ldots, k_1$ and sufficiently
small $\delta$. Thus the sequence $(\alpha_{qo})_{\delta_n}$ can contain terms
of $\{q^k\}_{k=k_0}^{k_1}$ only if $\delta$ is not too small, since the minimal
value $\mu_{k^*}^\delta$ goes to zero as $\delta$ tends to zero. Since $k_1$ is
chosen arbitrarily, this implies the desired assertion. □

As remarked earlier, it is in general impossible to show the convergence of
$x_\alpha^\delta$ to $x^\dagger$ for a heuristic parameter choice in the
context of worst-case scenario analysis. For the quasi-optimality principle,
Glasko et al [19] defined the notion of auto-regularizable set as a condition
on the exact as well as noisy data. In the case of the continuous
quasi-optimality principle this is the set of $y^\delta$ such
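that

\[ \frac{\left\| \alpha \frac{d x_\alpha^\delta}{d\alpha} - \alpha \frac{d x_\alpha}{d\alpha} \right\|}{\| x_\alpha^\delta - x_\alpha \|} \ge q > 0 \]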

holds uniformly in $\alpha$ and $\delta$. This abstract condition on the exact
data has been replaced by a condition on the noise in [2].

In our setting, the following sets are helpful for proving convergence. 

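Definition 4.7. For $r > 0$, $q \in\, ]0, 1[$, $K : X \to Y$ and
$y^\dagger \in \operatorname{range} K$ we define the sets

\[ \mathcal{D}_r = \{ y^\delta \in Y \mid \forall k : |D(x^\delta_{q^k}, x^\delta_{q^{k-1}}) - D(x_{q^k}, x_{q^{k-1}})| \ge r\, D(x^\delta_{q^k}, x_{q^k}) \}. \]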

The condition $y^\delta \in \mathcal{D}_r$ can be regarded as a discrete
analogue of the above-mentioned auto-regularizable condition. With the set
$\mathcal{D}_r$ at hand, we can now show another result on the asymptotic
behavior of the quasi-optimality sequence. The condition is that the noisy
data belongs to some set $\mathcal{D}_r$.
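On a finite range of indices $k$, the inequality defining $\mathcal{D}_r$ can
be tested directly once the three Bregman distances are available. The helper
names and the numbers below are hypothetical and serve only as a sketch; note
that a finite test can refute membership in $\mathcal{D}_r$ but cannot prove
it, since the definition quantifies over all $k$.

```python
import numpy as np

def largest_r(mu_noisy, mu_exact, d_cross):
    """Largest r with |mu_noisy[k] - mu_exact[k]| >= r * d_cross[k] for every
    k in the tested (finite) range; d_cross[k] = D(x^delta_{q^k}, x_{q^k}) > 0."""
    mu_noisy, mu_exact, d_cross = map(np.asarray, (mu_noisy, mu_exact, d_cross))
    return float(np.min(np.abs(mu_noisy - mu_exact) / d_cross))

def satisfies_condition(mu_noisy, mu_exact, d_cross, r):
    """Check the defining inequality of D_r on the tested range only."""
    return largest_r(mu_noisy, mu_exact, d_cross) >= r

# Hypothetical values for three consecutive indices k.
mu_noisy = [0.5, 0.4, 0.9]
mu_exact = [0.3, 0.1, 0.05]
d_cross = [0.4, 0.5, 1.0]
r_max = largest_r(mu_noisy, mu_exact, d_cross)
print(r_max)
```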

Lemma 4.8. Let $y^\delta \in \mathcal{D}_r$ for some $r > 0$ and assume that
$R(x^\delta_\alpha) \to \infty$ for $\alpha \to 0$. Then
$\mu_k^\delta \to \infty$ for $k \to \infty$.

Proof. We observe that

\[ r\, D(x^\delta_{q^k}, x_{q^k}) \le |D(x^\delta_{q^k}, x^\delta_{q^{k-1}}) - D(x_{q^k}, x_{q^{k-1}})| = |\mu_k^\delta - \mu_k|. \tag{27} \]

By the definition of the Bregman distance, ([5| and Q we have for
$\xi_\alpha = -K^*(K x_\alpha - y^\dagger)/\alpha$ that

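\[
\begin{aligned}
D_{\xi_\alpha}(x^\delta_\alpha, x_\alpha)
  &= R(x^\delta_\alpha) - R(x_\alpha) - \langle \xi_\alpha, x^\delta_\alpha - x_\alpha \rangle \\
  &= R(x^\delta_\alpha) - R(x_\alpha) + \frac{1}{\alpha} \langle K x_\alpha - y^\dagger, K(x^\delta_\alpha - x_\alpha) \rangle \\
  &\ge R(x^\delta_\alpha) - R(x_\alpha) - 4 \|w\| \delta.
\end{aligned}
\]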

Since $R(x_\alpha)$ is bounded for $\alpha \to 0$, we see by assumption that
$D_{\xi_\alpha}(x^\delta_\alpha, x_\alpha) \to \infty$ for $\alpha \to 0$. This
means that for $k \to \infty$ there holds that
$D(x^\delta_{q^k}, x_{q^k}) \to \infty$, and since $\mu_k \to 0$, the claim
follows from (27). □

Now we are in position to show the main result of this section, i.e., conver- 
gence for the quasi-optimality principle. 

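Theorem 4.9. Let $(\delta_n)_n$, $\delta_n > 0$, be a sequence converging to
zero such that $y^{\delta_n} \to y^\dagger \in \operatorname{range} K$ and
$y^{\delta_n} \in \mathcal{D}_r$ for some $r > 0$. Let
$(\alpha_n^{qo} = \alpha^{qo}(y^{\delta_n}))_n$ be the sequence of
regularization parameters chosen by the quasi-optimality principle. Then

\[ \lim_{n \to \infty} D(x^{\delta_n}_{\alpha_n^{qo}}, x^\dagger) = 0. \]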



Proof. Denote $\alpha_n^{qo}$ by $q^{k_n}$. Then by using Corollary 2.4 we
derive

\[
\begin{aligned}
D(x^{\delta_n}_{q^{k_n}}, x^\dagger)
  &\le D(x^{\delta_n}_{q^{k_n}}, x_{q^{k_n}}) + D(x_{q^{k_n}}, x^\dagger) + 6 \|w\| \delta_n \\
  &\le \frac{1}{r} |D(x^{\delta_n}_{q^{k_n}}, x^{\delta_n}_{q^{k_n - 1}}) - D(x_{q^{k_n}}, x_{q^{k_n - 1}})| + D(x_{q^{k_n}}, x^\dagger) + 6 \|w\| \delta_n \\
  &= \frac{1}{r} |\mu^{\delta_n}_{k_n} - \mu_{k_n}| + D(x_{q^{k_n}}, x^\dagger) + 6 \|w\| \delta_n.
\end{aligned}
\]

Now all three terms on the right-hand side tend to zero for $n \to \infty$
(the first due to Lemma 4.3 and the second due to
$q^{k_n} = \alpha_n^{qo} \to 0$ by Corollary 4.6). □

This theorem shows that it is possible that the quasi-optimality principle
leads to convergence in the setting of convex variational regularization.
However, the important question of what the sets $\mathcal{D}_r$ look like,
and especially under what circumstances they are non-empty, remains open. In
[19, 2] the authors use spectral theory to investigate this issue, a tool
which is unfortunately unavailable in our general setting.



5 Numerical experiments 

We conducted several experiments to illustrate our theoretical findings. 

5.1 Experiment 1: Accuracy of the estimates 

In the first experiment we illustrate the sharpness of the estimates of the
approximation, data and total errors. In particular, we illustrate how the
function $\phi$ from the Hanke-Raus rule approximates the total error.

The setting is as follows: We consider a deconvolution problem with sparsity
constraints. In particular, the space $X$ is the sequence space $\ell^2$ and
$Y$ is the Hilbert space $L^2[0,1]$. The operator under consideration is
$K = AB : \ell^2 \to L^2[0,1]$, where $A : L^2[0,1] \to L^2[0,1]$ is a circular
convolution operator which convolves with the characteristic function of an
interval of width 0.2 and $B : \ell^2 \to L^2[0,1]$ is a Haar wavelet synthesis
operator. Hence, the operator $K$ takes a square summable sequence $x$, uses it
as the expansion coefficients with respect to an orthonormal Haar wavelet basis
and afterwards performs a circular convolution. The regularization functional
$R$ is the functional $\|\cdot\|_{\ell^p}^p$, i.e.,

\[ R(x) = \sum_k |x_k|^p, \]

which has, for $p > 1$, a single valued subgradient
$\partial R(x) = \{ p \operatorname{sign}(x) |x|^{p-1} \}$ (understood
componentwise). In particular we have chosen $p = 1.2$ to promote sparsity of
the minimizers (cf. [13]) and to get a $p$-convex functional simultaneously. To
construct a solution $x^\dagger$ fulfilling the source condition (3), we
started with a function $w \in L^2[0,1]$ and set $\xi = K^* w$. Then
$x^\dagger$ was defined as

\[ x^\dagger = \operatorname{sign}(\xi) |\xi / p|^{1/(p-1)}. \]
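That this choice of $x^\dagger$ satisfies $\xi \in \partial R(x^\dagger)$ can
be verified componentwise. In the sketch below the vector `xi` holds
hypothetical random values standing in for $K^* w$; only the two formulas from
the text are used.

```python
import numpy as np

p = 1.2
rng = np.random.default_rng(1)
xi = rng.standard_normal(6)   # hypothetical stand-in for xi = K^* w

# x_dagger = sign(xi) * |xi / p|^(1/(p-1)), applied componentwise
x_dagger = np.sign(xi) * np.abs(xi / p) ** (1.0 / (p - 1.0))

# Single valued subgradient of R(x) = sum_k |x_k|^p evaluated at x_dagger
grad_R = p * np.sign(x_dagger) * np.abs(x_dagger) ** (p - 1.0)

print(np.allclose(grad_R, xi))
```

Since $|x^\dagger|^{p-1} = |\xi|/p$ and the signs agree, the subgradient at
$x^\dagger$ reproduces $\xi$, which is what the source condition requires.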

We discretized the problem to 512 wavelet coefficients. Figure 1 shows the
chosen $w \in L^2[0,1]$, the function $B x^\dagger \in L^2[0,1]$ and the exact
data $y^\dagger = A B x^\dagger \in L^2[0,1]$. Both vectors $\xi$ and
$x^\dagger$ consist of 165 non-zero coefficients, however, their plots are
noninformative.

Figure 1: Experiment 1: Left: $w$ from the source condition. Middle:
$B x^\dagger$. Right: $y^\dagger$.

For a fixed noise level $\delta = 0.02$, we generated noisy data $y^\delta$
such that $\|y^\dagger - y^\delta\| = \delta$. Then we calculated minimizers
$x^\delta_\alpha$ and $x_\alpha$ of the Tikhonov functional with data
$y^\delta$ and $y^\dagger$, respectively, for different values of $\alpha$ with
the combined iterative hard- and soft-thresholding from [7] (see [S] for the
iterative hard-thresholding algorithm and [M], [6] for the iterative
soft-thresholding algorithm; the code is available at
http://www-public.tu-bs.de:8080/~dirloren/progs/iter_thresh.m). We calculated
the different errors and the function $\phi$ from the Hanke-Raus rule and show
them in Figure 2. We observe that the function $\phi$ captures the behavior of
the total error very well. Moreover, the sum of the approximation and data
errors is close to the total error. Surprisingly, the estimate from
Proposition 2.2 is even closer to the function $\phi$ than the total error
itself, a result which is not yet backed up by theory.
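Noisy data with a prescribed noise level, as used in this experiment, can be
produced by rescaling a random perturbation. The sketch below is an assumption
about one possible way to do this (the text does not specify the noise model);
it uses Gaussian noise and the Euclidean norm as a discrete stand-in for the
$L^2[0,1]$ norm, and `y_exact` is a hypothetical vector.

```python
import numpy as np

def add_noise(y_exact, delta, rng):
    """Return y_exact plus a random perturbation whose norm is exactly delta."""
    noise = rng.standard_normal(y_exact.shape)
    noise *= delta / np.linalg.norm(noise)
    return y_exact + noise

rng = np.random.default_rng(0)
y_exact = np.linspace(0.0, 1.0, 50)   # hypothetical exact data
delta = 0.02
y_noisy = add_noise(y_exact, delta, rng)
print(np.linalg.norm(y_noisy - y_exact))
```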

Remark 5.1. The obtained results have been observed to be robust with respect
to different noise realizations and different $w$ (if the obtained sparsity of
the corresponding $x^\dagger$ is comparable).

5.2 Experiment 2: The Hanke-Raus rule 

In this experiment we illustrate the performance of the Hanke-Raus rule. We
use the same setup as in the first experiment, i.e., the same $x^\dagger$ and
$K$. For a range of $\delta$ we generated noisy data $y^\delta$ and calculated
the regularization parameter $\alpha_{HR}$ with the Hanke-Raus rule of
Section 3 in a brute-force manner: we tested values for $\alpha$ on a
logarithmically uniform grid. As the exact solution $x^\dagger$ is known in
this case, we also calculated the optimal regularization parameter
$\alpha_{opt}$, i.e., the parameter $\alpha$ for which the error
$D(x^\delta_\alpha, x^\dagger)$ is smallest, see Figure 3 for the results. It
is observed that the Hanke-Raus parameter follows the optimal parameter
closely in this example and accordingly the error of the Hanke-Raus rule is
close to the optimal error.
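The brute-force search over a logarithmically uniform grid, and the
computation of the optimal parameter $\alpha_{opt}$ when $x^\dagger$ is known,
can be sketched as follows. This is not the setup of the experiment: to keep
the example self-contained it assumes the quadratic penalty
$R(x) = \frac12 \|x\|_2^2$, for which the Bregman distance is
$\frac12 \|x - \tilde x\|_2^2$ and the minimizer has a closed form, and it uses
a hypothetical ill-conditioned random matrix.

```python
import numpy as np

def tikhonov(K, y, alpha):
    """Minimizer of 0.5*||K x - y||^2 + alpha * 0.5*||x||^2 (quadratic stand-in)."""
    return np.linalg.solve(K.T @ K + alpha * np.eye(K.shape[1]), K.T @ y)

rng = np.random.default_rng(2)
# Hypothetical toy operator with decaying column scaling (ill-conditioned).
K = rng.standard_normal((20, 10)) @ np.diag(0.5 ** np.arange(10))
x_true = rng.standard_normal(10)
delta = 0.02
noise = rng.standard_normal(20)
y_delta = K @ x_true + delta * noise / np.linalg.norm(noise)

alphas = np.logspace(-8, 0, 33)   # logarithmically uniform grid
# Bregman distance to the exact solution for each grid value
errors = np.array([0.5 * np.sum((tikhonov(K, y_delta, a) - x_true) ** 2)
                   for a in alphas])
alpha_opt = alphas[np.argmin(errors)]
print(alpha_opt, errors.min())
```

A heuristic rule would replace `errors` by a computable surrogate evaluated on
the same grid; the oracle choice above is only available because `x_true` is
known.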

5.3 Experiment 3: The quasi-optimality principle 

This time the operator $K$, the data $x^\dagger$ and the regularization
functional $R$ are again similar to Experiments 1 and 2. Here we analyze how
the quasi-optimality principle from Section 4 performs in practice. We chose
$\alpha_0 = 100 \cdot \delta$ and $q = 0.8$. Then we calculated minimizers
$x^\delta_{q^k \alpha_0}$ for several values of $k$ and chose
$\alpha_{qo} = q^k \alpha_0$ as the one which minimized
$D(x^\delta_{q^k \alpha_0}, x^\delta_{q^{k-1} \alpha_0})$. Again, we also
calculated the optimal value $\alpha_{opt}$ of the regularization parameter and
the corresponding errors, see Figure 4 for the results. Again we observed that
this choice follows the optimal regularization parameter closely and can
produce accurate solutions.
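The selection step of the quasi-optimality principle on the geometric grid
$q^k \alpha_0$ can be sketched as follows. As an assumption for
self-containedness, the example replaces the $\ell^p$ penalty of the
experiment by the quadratic penalty $R(x) = \frac12 \|x\|_2^2$ (Bregman
distance $\frac12 \|x - \tilde x\|_2^2$); the matrix, the data and the bound
`k_max` are hypothetical, while $\alpha_0 = 100 \cdot \delta$ and $q = 0.8$
follow the text.

```python
import numpy as np

def tikhonov(K, y, alpha):
    """Minimizer of 0.5*||K x - y||^2 + alpha * 0.5*||x||^2 (quadratic stand-in)."""
    return np.linalg.solve(K.T @ K + alpha * np.eye(K.shape[1]), K.T @ y)

def quasi_optimality(K, y, alpha0, q, k_max):
    """Return (alpha_qo, mu) with mu[k-1] = D(x_{q^k alpha0}, x_{q^(k-1) alpha0})."""
    xs = [tikhonov(K, y, alpha0 * q**k) for k in range(k_max + 1)]
    # For R = 0.5*||.||^2 the Bregman distance is 0.5*||x - x_tilde||^2.
    mu = np.array([0.5 * np.sum((xs[k] - xs[k - 1]) ** 2)
                   for k in range(1, k_max + 1)])
    k_star = 1 + int(np.argmin(mu))
    return alpha0 * q**k_star, mu

rng = np.random.default_rng(3)
K = rng.standard_normal((30, 15)) @ np.diag(0.6 ** np.arange(15))
x_true = rng.standard_normal(15)
delta = 0.02
noise = rng.standard_normal(30)
y_delta = K @ x_true + delta * noise / np.linalg.norm(noise)

alpha0, q, k_max = 100 * delta, 0.8, 40
alpha_qo, mu = quasi_optimality(K, y_delta, alpha0, q, k_max)
print(alpha_qo)
```

The range of $k$ is kept bounded here; how large `k_max` should be in a
discretized problem is a design choice that this sketch does not address.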

Figure 2: Experiment 1: Illustration of the different errors in log-log scale.

Figure 3: Experiment 2: Left: The regularization parameter by the Hanke-Raus
rule and the optimal parameter in dependence of $\delta$. Right: The
corresponding errors.

Figure 4: Experiment 3: Left: The regularization parameter by the
quasi-optimality criterion and the optimal parameter in dependence of
$\delta$. Right: The corresponding errors.

5.4 Experiment 4: Deblurring with elastic net

Table 1: Results for experiment 4.

    Rule                        $\alpha$    Bregman distance   norm error
    smallest Bregman distance   ?.10e-02    5.54e-02           1.02e+01
    smallest norm               3.20e-03    7.39e-02           7.36e+00
    Hanke-Raus                  2.61e-03    9.03e-02           7.47e+00
    quasi-optimality            3.02e-03    7.51e-02           7.38e+00
    discrepancy                 9.29e-04    7.16e-01           1.01e+01

In this experiment we used a standard problem from the Regularization Tools
toolbox by P.C. Hansen [24], namely the blur problem. We used the parameters
N=50, band=5, sigma=1.2 and employed the so-called elastic-net regularization
[39, 27], that is, a penalty term

\[ R(x) = \|x\|_1 + \frac{\eta}{2} \|x\|_2^2. \]

On the one hand, this weighted sum of the one- and the two-norm can be seen 
as a stabilization for one-norm regularization and on the other hand, it leads to 
a kind of grouping effect, see also [3U1 157] , 

We generated a noisy image $y^\delta$ (with $\delta = 0.1$) and fixed
$\eta = 10^{-3}$. We used a regularized semismooth Newton method (proposed in
[3T] for the case $\eta = 0$ and generalized to $\eta > 0$ in [27]). Then we
calculated solutions for a range of $\alpha$ and determined the regularization
parameters according to the Hanke-Raus rule and the quasi-optimality
criterion. Moreover, we calculated the parameter according to the discrepancy
principle [31] (to compare with a non-heuristic a-posteriori rule) and the
optimal regularization parameter with respect to the norm and the Bregman
distance. We report the results in Table 1 and Figure 5.
We observe that all rules produce reasonable results and perform comparably
in terms of visual inspection. However, the numbers say a little bit more:
The discrepancy principle chooses a parameter which is a bit too small and
leads to larger errors both in terms of the Bregman distance and the norm. The
Hanke-Raus rule and the quasi-optimality principle choose comparable
parameters, while the quasi-optimality principle performs slightly better.
Moreover, the errors by the two proposed rules agree excellently with the
optimal one both in terms of the Bregman distance and the norm.

Figure 5: Results for the blur problem for the Hanke-Raus rule and the
quasi-optimality criterion. Panels: smallest Bregman distance, smallest norm,
HR-rule, quasi-optimality, discrepancy principle.



6 Conclusion 

We have derived two error estimate-based heuristic parameter choice rules for 
general convex variational regularization on the basis of a refined analysis of 
the regularization process. These rules reproduce the Hanke-Raus rule and 
the quasi-optimality criterion for the conventional quadratic regularization. A 
posteriori error estimates have been derived for the Hanke-Raus rule using the 
Bregman distance. The convergence of both rules is discussed by imposing
conditions on the noisy data. Numerical results have verified some theoretical
findings and showed the effectiveness of these rules. An important future
research problem is to develop efficient algorithms to numerically realize
these rules. This is nontrivial because the functionals under consideration
are often nonsmooth and there exists only an implicit relation between the
solution $x^\delta_\alpha$ and the regularization parameter $\alpha$.